INTRODUCTION


Large-scale  software  projects  necessarily  involve 
communications  among  individuals  with  diverse  skills  and 
experience.  Software  design,  coding,  and  maintenance  are 
commonly  performed  by  a  variety  of  individuals  at  different 
points  in  time.  The  efficiency  with  which  software-related 
tasks  are  performed  depends  critically  on  the  documentation 
supplied  from  the  previous  phases  of  the  software  life  cycle. 
The  purpose  of  this  research  is  to  empirically  evaluate  a 
number  of  different  documentation  formats.  Previous 
experiments  in  this  series  have  examined  the  effects  of  these 
formats  on  comprehension  and  coding  performance.  The  current 
experiment  investigated  performance  in  a  debugging  task. 

Empirical  Evaluation  of  Software  Documentation  Formats 

There has been a continued interest in the relative value
of flowcharts, program design language (PDL), and English prose
as software development and documentation tools. An early
empirical assessment of the value of flowcharts in programming
was reported by Shneiderman, Mayer, McKay and Heller (1977).
They performed a series of experiments on the composition,
comprehension, debugging and modification of programs. For the
composition task, the participants were asked to write a
program; some were also asked to produce a flowchart in
addition to the program. For the comprehension, debugging, and
modification tasks, all participants were given a program
listing while some were given a flowchart as an additional
aid. Shneiderman et al. found no significant differences in
any of their experiments between groups that did and did not
use flowcharts.

In  another  study,  Ramsey,  Atwood,  and  Van  Doren  (1978) 
compared  the  effectiveness  of  flowcharts  to  that  of  a  program 
design  language.  In  one  experiment,  programmers  expressed  a 
design  in  either  a  flowchart  or  PDL.  In  a  second  experiment, 
programmers  produced  code  from  designs  expressed  in  either  a 
flowchart or PDL. Ramsey et al. found no difference in
performance  on  the  tasks  in  either  experiment.  However,  the 
designs  expressed  in  a  PDL  were  judged  to  be  of  superior 
quality  in  that  they  included  greater  algorithmic  detail,  more 
modularization,  and  less  abbreviation  of  variable  names  than 
those  expressed  as  flowcharts. 

Brooke  and  Duncan  (1980)  compared  flowcharts  and  sequential 
instructions  as  debugging  tools.  They  concluded  that 
flowcharts  were  useful  for  tracing  execution  sequences  in  a 
program  but  were  not  helpful  in  conceptualizing  relationships 
among  non-contiguous  segments  of  the  program. 

Although studies performed on software-related tasks have
not  been  especially  favorable  to  flowcharts,  experiments 
performed  in  other  areas  of  information  presentation  have 
demonstrated  an  advantage  for  flowcharts  over  alternative 
presentation  formats  including  prose  descriptions,  short 
sentences,  and  decision  tables  (Wright  and  Reid,  1973;  Blaiwes, 
1974;  Kammann,  1975).  Kammann,  for  example,  presented 
participants  with  a  set  of  telephone  dialing  problems.  The 
dialing  instructions  were  presented  in  the  form  of  a  prose 
description  or  a  flowchart.  Fewer  errors  were  made  with  the 
flowchart.  [For  a  review  of  the  non-software  research,  see 
Sheppard,  Kruesi,  and  Curtis  (1981)]. 

An  experiment  recently  reported  by  Miller  (1981)  raises 
some  doubts  about  the  advisability  of  natural  language  as 
either  a  development  or  documentation  tool.  Miller  asked 
non-programmers  to  write  procedures  for  solving  problems  that 
were  representative  of  common  computer  applications.  Careful 
analysis  of  the  protocols  led  Miller  to  conclude  that  even 
minor  increases  in  the  complexity  of  problems  led  to  marked 
decreases  in  the  quality  of  the  solutions.  Further,  the  high 
degree  of  contextual  referencing  found  in  the  solutions 
provided  doubts  about  the  feasibility  of  adequate  natural 
language  specifications.  Miller  suggests  that  we  would  improve 
the  quality  of  programs  "...with  tools  that  structure  the 
problem  and  the  implementation  processes"  (p.212). 

Characteristics of Software Documentation

The  studies  described  above  have  involved  an  analysis  of 
documentation  formats  currently  in  use.  A  comparison  of  any 
two  or  more  formats,  such  as  PDL  and  flowcharts,  may  yield 
useful  information  about  the  relative  value  of  these  formats. 
This  comparison  does  not,  however,  allow  us  to  isolate  the 
source  of  any  observed  differences  since  documentation  formats 
vary  along  more  than  one  dimension. 

In  general,  there  are  two  primary  dimensions  for 
categorizing  how  available  documentation  aids  configure  the 
information they present to programmers (Jones, 1979). The
first dimension is the type of symbology in which information
is presented. The second dimension is the spatial arrangement
of this information. PDL, for example, uses constrained
language  as  the  symbology  presented  in  a  sequential  spatial 
arrangement.  Flowcharts  use  ideogram  symbols  presented  in  a 
branching  spatial  arrangement.  Thus,  any  differences  observed 
in  the  effectiveness  of  PDL  and  flowcharts  may  be  due  to  the 
differences  in  the  symbols,  in  the  spatial  arrangement  or  to  an 
interaction  of  these  two  dimensions. 

Our  approach  to  evaluating  various  forms  of  documentation 
is  to  investigate  the  separate  and  combined  effects  of  these 
two  dimensions.  Specifically,  we  have  factorially  combined 
three  types  of  symbols  with  three  spatial  arrangements  to 
produce  nine  different  formats. 
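
The factorial combination described above can be illustrated with a short sketch. The Python below is only an illustration and is not part of the original study materials; the level names come from the text, while the variable names are assumptions introduced here.

```python
from itertools import product

# Levels of the two dimensions, as named in the text.
SYMBOLOGIES = ["natural language", "constrained language", "ideograms"]
ARRANGEMENTS = ["sequential", "branching", "hierarchical"]

# Every symbology is paired with every arrangement (a 3 x 3 factorial).
formats = list(product(SYMBOLOGIES, ARRANGEMENTS))

for symbology, arrangement in formats:
    print(f"{arrangement} {symbology}")
```

In this scheme, conventional PDL corresponds to the pair (constrained language, sequential) and a conventional flowchart to the pair (ideograms, branching); the remaining seven pairs are formats that do not normally occur in practice.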


Type of Symbology. The symbology dimension includes
natural language, constrained language, and ideograms.
Documentation in the form of natural language is frequently found
embedded in the source code as either global or in-line
comments. Constrained language, which is embodied in a Program
Design Language (PDL), is more succinct than natural language,
using strictly defined keywords to describe arguments or
predicates. Ideograms are frequently found in flowcharts and
HIPO charts (Bohl, 1971; Katzen, 1976). A standard set of
ideograms has come to represent processes or entities within a
program.

Spatial Arrangement. The spatial arrangement of information
in documentation is a second dimension along which
documentation techniques can be categorized. In the current
experiment, this dimension is represented by a sequential, a
branching, and a hierarchical arrangement. A sequential
arrangement is typical of narrative description, program
listings and PDL while a branching arrangement is typical of
flowcharts. A hierarchical arrangement is not generally used
for individual module specifications but, rather, at the system
level to present a visual display of the relationship among
modules.

This report describes the third in a series of experiments
to investigate the effects of the type of symbology and the
spatial arrangement. For all experiments, the three types of
symbology (natural language, constrained language, and
ideograms) are factorially combined with the three spatial
arrangements to produce nine different documentation formats.
The first experiment, which is described in Sheppard, Kruesi,
and Curtis (1981), investigated comprehension performance. The
second experiment examined the influence of these dimensions on
the ability of programmers to translate the specifications into
code (Sheppard & Kruesi, 1981). The current experiment examined
the effects of these dimensions on performance in a debugging
task. The results of the first two experiments are described
briefly in the following sections.

Effects  of  Symbology  and  Spatial  Arrangement 
on  Comprehension 

In the first experiment, seventy-two professional programmers
were presented with specifications for each of three
modular-sized computer programs. The participants answered a
series of comprehension questions for each program using only
the specifications. The questions were presented interactively
on a CRT and consisted of three different types. For
forward-tracing questions, the participants were given the
values for a set of conditions in the program. Their task was
to trace through the specifications and find the first
statement executed under those conditions. For
backward-tracing questions, they were required to locate a
[...] input-output questions, they were given input data and were
asked to determine the value of particular variables at a later
point in the program.

Both forward- and backward-tracing questions were answered
more quickly from specifications presented in constrained
language or ideograms than in natural language. On the
average, forward-tracing questions were answered most quickly
from a branching arrangement and backward-tracing questions
were answered more quickly from the branching and hierarchical
arrangements. An examination of the individual formats
revealed that the sequential constrained language (normal PDL),
the  branching  constrained  language  and  the  branching  ideogram 
(normal  flowchart)  versions  were  associated  with  very  quick 
responses  for  both  types  of  questions.  For  the  input-output 
questions,  no  significant  differences  were  found  as  a  function 
of  the  type  of  symbology  or  the  spatial  arrangement.  At  the 
conclusion  of  the  experimental  session,  participants  were  asked 
to  list  the  type  of  symbology  and  the  spatial  arrangement  they 
most  preferred.  Constrained  language  was  the  most  preferred 
symbology  and  the  branching  spatial  arrangement  was  the  most 
preferred  arrangement. 

Effects  of  Symbology  and  Spatial  Arrangement 
in  a  Coding  Task 

In the second experiment (Sheppard & Kruesi, 1981),
thirty-six professional programmers were presented with
specifications and partially completed code for the same three
programs. The participants constructed a section of code at
the middle of each program. These sections contained about
fifteen lines and included the most complex decision structures
present in the programs. The code was completed using a text
editor, and the participants were asked to submit the program
for compilation and execution. If the program did not run
correctly, they were asked to correct the errors and submit it
again.

Substantial differences in coding time were associated with
the type of symbology. The natural language was considerably
more difficult to code from than the constrained language or
ideograms. An examination of the error data showed that these
differences were due both to errors in coding the control flow
and errors related to assignment statements and variables. The
effect of the spatial arrangement was not as great as the
effect of symbology. Although not statistically significant,
the branching arrangement appeared to be superior to the
sequential and hierarchical arrangements in minimizing
control-flow errors. A comparison of the individual formats
revealed that the constrained language presented in a
sequential or in a branching arrangement resulted in the
highest level of performance.
Again, constrained language was preferred by more
participants  than  ideograms  or  natural  language,  and  branching 
was  the  preferred  spatial  arrangement. 

Debugging 

The  current  experiment  compared  the  same  nine  formats  in  a 
debugging  task.  The  participants  were  given  specifications  for 
each  of  three  modular-sized  programs  (about  50  lines  of  code). 
They  compared  these  specifications  to  error-seeded  program 
listings.  Their  task  was  to  locate  and  correct  the  errors 
using a text editor. Performance was measured by the time
required to detect and correct the errors and by the number of
submissions required for a correct run.


METHOD 


Participants 

Thirty-six professional programmers from two different
locations participated in this experiment. All were General
Electric employees. The participants averaged 6.2 years of
professional programming experience (S.D. = 4.9) and had used
an average of 5 programming languages (S.D. = 2.3).

Independent  Variables 

The  experiment  was  designed  to  study  the  effects  of  three 
independent  variables:  the  type  of  symbology,  the  spatial 
arrangement  of  the  information,  and  the  type  of  program. 

Program  type.  In  our  previous  research  (Sheppard,  Curtis, 
Milliman  &  Love,  1979)  significant  differences  in  programmer 
performance  were  often  associated  with  differences  among 
programs.  Three  programs  of  varying  types  were  chosen  for  use 
in  this  experiment.  (These  three  programs  were  used  in  the 
first  two  experiments  as  well.)  A  program  which  calculated  the 
trajectory  of  a  rocket  was  chosen  as  representative  of  an 
engineering  algorithm.  An  inventory  system  for  a  grocery 
distribution center represented the class of programs that
manipulate data bases. A third program combined these two
types  of  applications.  This  program  interrogated  a  data  base 
for  information  concerning  the  traffic  pattern  at  an  airport 
and  simulated  future  needs  using  a  queuing  algorithm. 

These  three  programs  were  based  on  algorithms  contained  in 
Barrodale, Roberts, and Ehle (1971). The algorithms were
modified to incorporate only the constructs of sequence,
structured iteration, and structured selection. They were then
coded in Fortran and verified for correctness. Each of the
resulting programs contained approximately 50 lines of
executable code. In addition, a short algorithm (18 lines) to
find the largest of three integers was used as a practice
program.

The  practice  program  was  modified  to  contain  one  error. 
The  experimental  programs  each  contained  three  errors.  The 
errors  were  selected  from  among  errors  made  in  the  coding 
experiment,  which  had  used  the  same  experimental  materials. 
The  errors  included  both  transfer  of  control  and 
assignment/variable  errors  but  did  not  include  syntax  errors. 
Listings  of  the  incorrect  programs  are  shown  in  Appendix  A. 
Handwritten  corrections  are  included  for  the  reader's  benefit. 

Type of Symbology. The statements from each program were
translated  into  detailed  specifications.  Three  types  of 
symbology  were  used:  natural  language,  constrained  language, 
and  ideograms.  A  consistent  set  of  rules  was  used  to  map 
assignment,  selection,  and  iteration  statements  across  the 
three  types  of  symbology. 


Spatial  Arrangements.  Three  spatial  arrangements  were  used 
to  represent  the  program  structure:  sequential,  branching,  and 
hierarchical.  These  three  arrangements  differed  in  the 
representation  of  control  flow  and  nesting  levels.  In  the 
sequential  arrangement,  both  the  control  flow  and  the  levels  of 
nesting  were  represented  vertically.  In  the  branching 
arrangement,  the  flow  of  control  was  represented  vertically 
while  nesting  levels  were  represented  horizontally.  Finally, 
in  the  hierarchical  arrangement,  the  flow  of  control  was 
represented  horizontally  while  nesting  levels  were  represented 
vertically. 


Each  of  the  three  types  of  symbology  was  presented  in  the 
three  spatial  arrangements,  resulting  in  nine  specification 
formats  for  each  program.  Examples  of  the  nine  forms  for  the 
rocket trajectory program may be found in the first technical
report of this series (Sheppard, Kruesi, and Curtis, 1980).

Procedure


Prior  to  the  experiment,  the  participants  were  given  a 
20-minute  training  session  in  which  they  were  shown  each 
spatial  arrangement  and  each  type  of  symbology.  The 
experimenter  described  the  control  flow  for  each  arrangement 
using  a  sorting  program  as  an  example;  this  program  was  not 
seen  in  the  actual  experiment.  The  procedure  for  using  the 
text  editor  to  correct  the  programs  was  also  explained  in 
detail  during  the  training  session. 

Experimental  sessions  were  conducted  at  CRT  terminals  on  a 
VAX  11/780.  All  coding  was  done  in  Fortran.  The  participants 
were  first  given  a  practice  program  containing  a  single  error. 
Identical  listings  of  the  code  appeared  on  the  CRT  screen  and 
on  a  paper  printout.  The  participants  were  told  there  was  one 
error  and  were  asked  to  correct  the  code,  using  the  text 
editor.  When  satisfied  that  the  program  would  perform 
correctly,  a  participant  exited  from  the  editor  and  activated  a 
command  file  to  compile  and  run  the  program.  If  the 
compilation  was  unsuccessful,  a  compiler  message  appeared  on 
the  screen  directly  below  the  line  or  lines  containing  the 
error.  If  the  program  compiled  without  errors,  it  was 
automatically  executed  with  test  data,  and  the  output  from  the 
program  appeared  on  the  screen  with  one  of  the  following 
messages:  "OUTPUT  IS  CORRECT"  or  "OUTPUT  IS  INCORRECT."  In 
the  latter  case,  the  participant  was  asked  to  keep  trying  until 
the  program  was  correct. 

Following the practice program, the three experimental
programs were presented. For each program, the participants
received a correct version of the specifications; these were
contained on a single piece of paper. In addition, they
received identical listings of the error-seeded code on the CRT
screen and on a paper printout. They also received a data
dictionary listing each variable, a natural language
description of it, and its data type.

The  participants  were  told  that  there  were  several  errors 
in  each  experimental  program  and  that  all  of  them  were  located 
in  the  center  section  of  the  code,  labeled  the  "COMPUTATION" 
section (see Appendix A). They were instructed to compare the
specifications  to  the  code,  locate  the  errors  and  correct 
them.  If  a  participant  tried  running  the  program  without 
making  any  changes,  the  program  compiled  successfully  but 
produced  the  message  that  the  output  was  incorrect. 

An  interactive  data  collection  system  prompted  the 
participant  throughout  the  experimental  procedure.  The  system 
recorded  each  change  made  to  a  program.  An  interval  timer, 
accurate  to  the  nearest  second,  recorded  the  time  for  each 
action. When a participant required more than one editing
session to locate and correct the errors, the experimental
system recorded exits from the editor, any compilation errors,
and the incorrect outputs generated. From these data, the time
to debug the programs was calculated by summing the times from
the individual editing sessions; time for compiling and running
the programs was not included.
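
The timing rule described above (sum the editing sessions, exclude compiling and running) can be sketched as follows. This is a minimal illustration only: the `Interval` record, its `kind` labels, and the example log are hypothetical and do not reproduce the actual data collection system or any data from the study.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    kind: str     # "edit", "compile", or "run" (hypothetical labels)
    seconds: int  # duration; the timer was accurate to the nearest second


def debug_time_minutes(intervals):
    """Sum only the editing intervals, excluding compile and run time."""
    edit_seconds = sum(i.seconds for i in intervals if i.kind == "edit")
    return edit_seconds / 60.0


# Made-up log for one participant and one program (not data from the study):
# two editing sessions, each followed by a compile and a run.
example_log = [
    Interval("edit", 540),
    Interval("compile", 35),
    Interval("run", 10),
    Interval("edit", 300),
    Interval("compile", 30),
    Interval("run", 10),
]

print(debug_time_minutes(example_log))
```

In this made-up log, only the two editing intervals contribute to the debugging time; the number of submissions would correspond to the number of compile-and-run cycles.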


On  the  average,  the  participants  spent  approximately  16 
minutes  on  each  experimental  program.  They  were  required  to 
continue  working  on  a  program  until  all  errors  had  been  located 
and  corrected.  They  were  allowed  to  take  breaks  between 
programs. 


Following  the  experiment,  the  participants  completed  a 
questionnaire  about  their  previous  programming  experience.  The 
information  requested  included  number  of  years  of  professional 
experience,  number  of  programming  languages  known,  and  whether 
they  had  previously  worked  with  algorithms  of  the  types  used  in 
the  experiment.  The  participants  were  also  asked  about  their 
preferences  for  type  of  symbology  and  spatial  arrangement. 


Design 


The  three  types  of  symbology  (natural  language,  constrained 
language,  and  ideograms)  were  factorially  combined  with  the 
three  spatial  arrangements  (sequential,  branching,  and 
hierarchical)  to  produce  nine  specification  formats.  These 
nine  formats  were  constructed  for  each  of  the  three  programs, 
resulting  in  a  total  of  27  conditions. 

Participants received a set of specifications for each
program. Across the three programs, they saw each type of
symbology and each spatial arrangement. The first participant,
for example, saw the rocket trajectory program presented in
sequential natural language, the inventory control program in
hierarchical constrained language, and the airport traffic
program in branching ideograms. The participants were assigned
to conditions according to the procedures outlined in Winer
(1971). [See also Kirk (1968)]. Each of the 27 conditions was
used once within a set of nine participants. For this 3 x 3 x 3
randomized block design, a minimum of 36 participants is
required to assess all interactions and main effects. Across
the 36 participants, each program, symbology, and arrangement
was presented first, second, and third an equal number of times.
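
One way to satisfy the constraints stated above for a single set of nine participants is sketched below. The cyclic scheme is an assumption chosen for illustration; it is not claimed to be the procedure from Winer (1971) that was actually used, and it does not model the counterbalancing of presentation order across the 36 participants. The function name `block_of_nine` is hypothetical.

```python
PROGRAMS = ["rocket trajectory", "inventory control", "airport traffic"]
SYMBOLOGIES = ["natural language", "constrained language", "ideograms"]
ARRANGEMENTS = ["sequential", "branching", "hierarchical"]


def block_of_nine():
    """Return one block: nine participants, each a list of three conditions.

    Illustrative cyclic scheme (an assumption, not the Winer procedure):
    participant (i, j) sees program p with symbology (p + i) mod 3 and
    arrangement (2p + j) mod 3.
    """
    block = []
    for i in range(3):
        for j in range(3):
            participant = [
                (
                    PROGRAMS[p],
                    SYMBOLOGIES[(p + i) % 3],
                    ARRANGEMENTS[(2 * p + j) % 3],
                )
                for p in range(3)
            ]
            block.append(participant)
    return block


block = block_of_nine()
conditions = [cond for participant in block for cond in participant]
print(len(set(conditions)))  # number of distinct conditions in the block
print(block[0])              # conditions seen by the first participant
```

Under this scheme each participant sees every symbology and every arrangement exactly once, each of the 27 conditions occurs once per block, and the first participant's conditions match the example given in the text.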

RESULTS


Debugging  Task 


The  participants  required  an  average  of  16  minutes  to  debug 
a  program.  This  represents  the  amount  of  time  spent  studying 
the  program  and  using  the  text  editor  (i.e.,  the  total  time 
spent  at  the  terminal  less  the  time  for  compiling,  linking  and 
running).

There  were  no  differences  among  the  times  to  debug  the 
three  programs.  The  rocket  program  required  an  average  of  15.7 
minutes,  the  airport  program  15.8  minutes  and  the  inventory 
program  16.0  minutes. 


There  was  a  significant  difference  among  the  types  of 
symbology.  The  natural  language  versions  required  18.7  minutes 
as  compared  to  14.5  minutes  for  the  constrained  language  and 
14.2  minutes  for  the  ideograms  (Table  1).  This  difference  was 
verified  by  an  analysis  of  variance  (p  <  .05)  (See  Table  2). 
For  this  analysis,  a  logarithmic  transformation  was  carried  out 
on  the  times  to  attenuate  the  influence  of  extreme  scores  and 
to produce a more normal distribution (Kirk, 1968).

Table 1. Time to Debug (Minutes)

                            TYPE OF SYMBOLOGY
SPATIAL          NATURAL     CONSTRAINED
ARRANGEMENT      LANGUAGE    LANGUAGE       IDEOGRAMS    TOTAL

SEQUENTIAL         19.8        12.1           18.2        16.7
BRANCHING          18.2        14.6           14.6        15.8
HIERARCHICAL       18.1        16.7            9.8        14.9

TOTAL              18.7        14.5           14.2        15.8

Note: Individual cell means represent 12 participants.
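
Because every cell in Table 1 represents the same number of participants, the row and column totals should equal the simple averages of the corresponding cell means. The sketch below checks this using only the cell means reported in the table; the variable names are assumptions introduced for illustration.

```python
# Cell means from Table 1 (minutes). Rows are spatial arrangements; columns
# are natural language, constrained language, and ideograms, in that order.
CELL_MEANS = {
    "sequential":   [19.8, 12.1, 18.2],
    "branching":    [18.2, 14.6, 14.6],
    "hierarchical": [18.1, 16.7, 9.8],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Marginal means for each arrangement (row) and each symbology (column).
row_means = {name: mean(cells) for name, cells in CELL_MEANS.items()}
column_means = [mean(col) for col in zip(*CELL_MEANS.values())]
grand_mean = mean(v for cells in CELL_MEANS.values() for v in cells)

for name, value in row_means.items():
    print(f"{name:13s} {value:.1f}")
print("symbology    ", [f"{v:.1f}" for v in column_means])
print(f"grand mean    {grand_mean:.1f}")
```

The recomputed marginal means agree with the TOTAL row and column of Table 1 to within rounding of the published cell means.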


Table 2. Summary of ANOVA
Time to Debug

SOURCE                         df      SS      MS       F       p

TOTAL                         107     3.61

BETWEEN PARTICIPANTS
AND REPLICATIONS
  REPLICATIONS                  3      .15
  PARTICIPANTS WITHIN
  REPLICATIONS                 32      .02

WITHIN PARTICIPANTS
AND REPLICATIONS
  PROGRAM (P)                   2      .01     .01     .20
  SYMBOLOGY (S)                 2      .37     .18    3.60     .05
  ARRANGEMENT (A)               2      .03     .01     .20
  P * S                         4      .17     .04     .80
  P * A                         4      .07     .02     .40
  S * A                         4      .23     .06    1.20
  P * S * A                     8      .24     .03     .60
  RESIDUAL                     46     2.32     .05

The effect of the spatial arrangement was not significant,
and there were no significant interactions.
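
The F ratios reported in Table 2 appear to follow from dividing each effect's mean square by the residual mean square. The sketch below recomputes them from the rounded mean squares in the table, so it can only reproduce the published values approximately; the dictionary and function names are assumptions introduced for illustration.

```python
# Mean squares from Table 2 (log-transformed debugging times), as published.
MS_RESIDUAL = 0.05
MS_EFFECTS = {
    "PROGRAM (P)": 0.01,
    "SYMBOLOGY (S)": 0.18,
    "ARRANGEMENT (A)": 0.01,
    "P * S": 0.04,
    "P * A": 0.02,
    "S * A": 0.06,
    "P * S * A": 0.03,
}


def f_ratio(ms_effect, ms_error=MS_RESIDUAL):
    """F ratio for an effect tested against the residual mean square."""
    return ms_effect / ms_error


f_ratios = {source: f_ratio(ms) for source, ms in MS_EFFECTS.items()}

for source, f in f_ratios.items():
    print(f"{source:16s} F = {f:.2f}")
```

Only the symbology effect yields a ratio well above 1, which is consistent with it being the only effect reported as significant.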


Number  of  Submissions 


All  of  the  errors  in  the  programs  were  successfully  located 
and  corrected  by  all  of  the  participants.  An  average  of  2.0 
submissions  were  required  to  run  the  programs  correctly.  As 
with  the  debugging  times,  there  were  no  differences  in  number 
of  submissions  across  programs. 

Table  3  presents  the  number  of  submissions  broken  down  by 
type  of  symbology  and  spatial  arrangement.  Unlike  the 
debugging  times,  there  were  no  significant  differences  for  type 
of  symbology.  An  analysis  of  variance  indicated  no  significant 
main  effects  or  interactions. 

Preferences  for  Type  of  Symbology  and  Spatial  Arrangement 

Across  the  three  programs,  each  participant  received 
specifications  in  each  type  of  symbology  and  in  each  spatial 
arrangement.  The  questionnaire  indicated  which  three  of  the 
nine  versions  they  had  experienced  during  the  experiment.  They 
were  asked  to  state  which  of  the  three  versions  they 
preferred.  Table  4  shows  these  preferences. 


Table 3. Number of Submissions
Required to Complete Task

                            TYPE OF SYMBOLOGY
SPATIAL          NATURAL     CONSTRAINED
ARRANGEMENT      LANGUAGE    LANGUAGE       IDEOGRAMS    TOTAL

SEQUENTIAL          1.8         1.9            2.5         2.1
BRANCHING           2.2         1.9            1.8         2.0
HIERARCHICAL        1.6         1.8            1.3         1.8

TOTAL               1.9         1.9            2.2         2.0

Note: Individual cell means represent 12 participants.


Table 4. Percent of Preferences
for Symbology and Spatial Arrangement

                            TYPE OF SYMBOLOGY
SPATIAL          NATURAL     CONSTRAINED
ARRANGEMENT      LANGUAGE    LANGUAGE       IDEOGRAMS    TOTAL

SEQUENTIAL           9           9              6          24
BRANCHING           15          18             25          58
HIERARCHICAL         9           6              3          18

TOTAL               33          33             34         100

The three types of symbology were preferred equally often.
In terms of the spatial arrangement, branching was the most
preferred, sequential was intermediate and hierarchical was the
least preferred.


Experiential  Factors 


The  questionnaire  also  asked  for  the  number  of  years  the 
participants  had  programmed  professionally  and  the  number  of 
programming  languages  they  had  used.  No  correlation  was  found 
between  years  of  experience  and  time  to  debug.  Number  of 
languages  and  debugging  time  were  correlated  -.26  (p  <  .06), 
indicating  that  programmers  who  had  experience  with  a  greater 
number  of  programming  languages  performed  the  tasks  in  this 
experiment  more  quickly. 

DISCUSSION


The  same  three  programs  were  used  in  the  current  experiment 
as  in  the  comprehension  and  coding  experiments.  In  the  earlier 
experiments,  significant  differences  in  performance  were 
associated  with  these  three  programs.  Specifically,  the 
airport-scheduling  program  was  considerably  more  difficult  than 
the inventory-control or rocket-trajectory programs. In the
current  experiment,  no  differences  were  observed  in  performance 
across  the  three  programs.  One  possible  explanation  for  this 
equality  is  that  the  relative  difficulty  of  the  errors  exactly 
compensated  for  the  inherent  difficulty  of  the  programs.  Thus, 
the  errors  seeded  in  the  airport-scheduling  program  may  have 
been  easier  errors  to  detect  and  correct  than  those  seeded  in 
the  remaining  two  programs.  This  "balancing"  explanation 
appears  unlikely  since  the  types  of  errors  (transfer  of  control 
and  assignment/variable)  and  their  physical  locations  were 
similar  across  programs. 

Another  possible  explanation  is  that  debugging  a  program 
from  detailed  specifications  which  are  known  to  be  correct  does 
not  require  as  much  knowledge  of  the  intricacies  of  the 
algorithm  as  does  comprehending  the  specifications  or  coding 
from  the  specifications.  Thus,  the  inherent  difficulty  of  the 
algorithm  may  be  less  important  in  this  type  of  a  debugging 
task  than  in  the  earlier  comprehension  and  coding  tasks. 


Differences  in  the  type  of  symbology  followed  the  pattern 
established  in  the  first  two  experiments:  the  natural  language 
versions  resulted  in  significantly  longer  response  times  than 
the  constrained  language  and  ideogram  versions.  Had  the 
natural  language  been  written  casually,  one  could  hypothesize 
that  it  was  incomplete  and  misleading.  However,  the  natural 
language  was  developed  very  precisely.  Assignment,  selection 
and  iteration  statements  were  translated  from  the  original  code 
into  the  three  types  of  symbology  according  to  a  rigid  set  of 
rules to ensure that the natural language specifications were
as  complete  and  precise  as  the  constrained  language  and 
ideograms.  It  is  reasonable  to  conclude,  therefore,  that  the 
differences  were  due  to  real  differences  among  the  types  of 
symbology  rather  than  to  an  experimental  artifact.  When 
combined  with  identical  conclusions  from  the  two  previous 
experiments  in  this  series,  this  result  presents  strong 
evidence that detailed program specifications should be
presented  in  a  more  succinct  symbology  than  natural  language. 

No  pronounced  effect  for  spatial  arrangement  appeared  in 
this  experiment.  This  result  agrees  with  results  from  the 
coding  experiment,  where  time  to  code  and  debug  showed  no 
significant  effect  due  to  spatial  arrangement. 

The  comprehension  experiment  differed  from  this  experiment 
and  the  coding  experiment  in  that  there  were  differences  among 
the spatial arrangements. Forward-tracing questions were
answered most quickly from the branching arrangement, and
backward-tracing  questions  were  answered  more  quickly  from  the 
branching  and  hierarchical  arrangements.  Response  times  for 
input-output  questions  did  not  vary  significantly  as  a  function 
of  spatial  arrangement.  One  explanation  for  the  differing 
results  among  the  experiments  is  that  programming  activities 
relating  to  control  flow  (such  as  tracing)  benefit  from  the 
more  pictorial  branching  and  hierarchical  arrangements,  while 
other  activities  are  not  affected  by  the  spatial  arrangement. 
This  explanation  is  supported  by  the  Brooke  and  Duncan  results 
presented  in  the  Introduction. 
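
The two kinds of tracing question can be made concrete as reachability queries over a control-flow graph. The Python sketch below is illustrative only. It assumes that a forward-tracing question asks which statements can execute after a given point, and that a backward-tracing question asks which statements can lead to it. The graph and its node names are hypothetical and are not drawn from the experimental materials.

```python
from collections import deque

# Hypothetical control-flow graph for a "largest of three" routine.
# Node names are illustrative labels, not taken from the report.
FLOW = {
    "read": ["test_i_gt_j"],
    "test_i_gt_j": ["test_i_gt_k", "test_j_gt_k"],
    "test_i_gt_k": ["large_is_i", "large_is_k"],
    "test_j_gt_k": ["large_is_j", "large_is_k"],
    "large_is_i": ["print"],
    "large_is_j": ["print"],
    "large_is_k": ["print"],
    "print": [],
}


def reachable(graph, start):
    """Breadth-first search: nodes reachable from start by one or more edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(graph):
    """Return the same graph with every edge pointing the other way."""
    rev = {node: [] for node in graph}
    for node, successors in graph.items():
        for nxt in successors:
            rev[nxt].append(node)
    return rev


def forward_trace(start):
    """Statements that can execute after `start` (assumed forward question)."""
    return reachable(FLOW, start)


def backward_trace(target):
    """Statements that can precede `target` (assumed backward question)."""
    return reachable(reverse(FLOW), target)
```

Both questions reduce to the same search, run over the original edges for a forward trace and over the reversed edges for a backward trace. An arrangement that draws the branches explicitly shows those edges directly, which is one way to read the advantage reported for the branching and hierarchical arrangements.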


One  interesting  result  found  in  all  three  experiments  was 
that  the  sequential  and  branching  constrained  language  versions 
were  consistently  associated  with  low  response  times  and  a 
small  number  of  errors.  In  cases  where  another  version  was 
associated with a lower response time (e.g., the hierarchical
ideogram version in this experiment), differences between the two
constrained language versions and the other version were not
statistically significant. Of the software specifications
currently in use (i.e., natural language, PDL, and flowcharts),
it  appears  that  PDL  results  in  faster  and  less  error-prone 
performance  than  natural  language  specifications;  flowcharts 
appear  in  between.  Sequential  PDL  has  the  additional  advantage 
of being easy to produce at a terminal and easy to read
automatically.

The participants in this experiment had no distinct
preference for any of the three types of symbology. This
result was surprising because in the previous two experiments
constrained  language  was  preferred,  ideograms  were  second  and 
natural  language  was  least  preferred.  As  in  the  previous 
experiments,  the  branching  arrangement  was  the  most  preferred, 
the sequential arrangement was intermediate, and the hierarchical
arrangement was preferred least.

Diversity of experience, in terms of the number of languages
used, was a better predictor of performance than years
of experience. This result replicates results from the comprehension
experiment and our previous research (Sheppard, Milliman
&  Curtis,  1979)  and  highlights  the  importance  of  ensuring  that 
programmers  have  an  opportunity  to  gain  broad  applications 
experience  as  part  of  their  professional  development. 

This experiment provides additional evidence that specification
format can have a significant effect on the performance
of  programmers  on  software-related  tasks.  A  debugging  task  was 
carried  out  more  quickly  from  specifications  presented  in  a 
succinct  symbology.  An  examination  of  the  individual  cell 
means revealed four formats that led to a high level of performance.
These were the constrained language presented in a
sequential  and  in  a  branching  arrangement  and  the  ideograms 
presented  in  a  branching  and  in  a  hierarchical  arrangement. 
Natural  language  led  to  consistently  poor  performance, 
regardless  of  the  spatial  arrangement. 
APPENDIX A


ERROR-SEEDED  PROGRAM  LISTINGS 


PRACTICE PROGRAM

Find the largest of three integers, I, J, & K.

C
      OPEN (UNIT=1, NAME='PRAC.DAT' [illegible]
      READ (1,60) I, J, K
      IF (I .GT. J) GO TO 20
      IF (J .GT. K) GO TO 10
      LARGE = K
      GO TO 40
   10 LARGE = [illegible]
      GO TO 40
   20 IF (I .GT. K) GO TO 30
      LARGE = K
      GO TO 40
   30 LARGE = I
   40 PRINT 70, LARGE
      CLOSE (UNIT=1)
   60 FORMAT (3I2)
   70 FORMAT (10X, 'LARGEST = [illegible]
      STOP
      END
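
For comparison with the seeded practice listing, the intended computation can be sketched in a few lines. The Python function below is an assumption about what an error-free version of the practice program would compute: the larger of I and J is compared against K. It is not taken from the report, and the names largest_of_three and disagreements are illustrative.

```python
from itertools import product


def largest_of_three(i, j, k):
    """Return the largest of three integers using pairwise comparisons."""
    if i > j:
        # I beats J, so the answer is either I or K.
        return i if i > k else k
    # J is at least as large as I, so the answer is either J or K.
    return j if j > k else k


def disagreements(candidate, values=range(-2, 3)):
    """List the input triples on which `candidate` differs from built-in max."""
    return [t for t in product(values, repeat=3) if candidate(*t) != max(t)]
```

Running disagreements over a small range of inputs is one way to confirm that a seeded error changes observable behavior: a correct routine yields an empty list, while a routine with a wrong branch target or assignment yields the triples that expose it.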

ROCKET  PROGRAM 


      INTEGER MAXT, TIME, FLAG
      REAL VACCEL, VVELOC, VDIST, HACCEL, HVELOC, HDIST,
     1     ANGLE, TILT, GRAV, MASS, FUEL, FORCE
C
C
C     INITIALIZATION:
C
C
      VACCEL = 0.
      VVELOC = 0.
      VDIST = 0.
      HACCEL = 0.
      HVELOC = 0.
      HDIST = 0.
      ANGLE = 0.
      TILT = 0.3491
      GRAV = 32.
      MASS = 10000.
      FUEL = 50.
      FORCE = 400000.
      MAXT = 200
      FLAG = 0
      TIME = 1
C
C
C     COMPUTATION:
C
C
   10 IF (FLAG [illegible]) GO TO 60
      IF (TIME [illegible] 100) GO TO 20
      MASS = MASS [illegible] FUEL
      IF (TIME .NE. 11) GO TO 30
      ANGLE = TILT
      GO TO 30
   20 IF (TIME .NE. [illegible]) GO TO 30
      FORCE = 0.0
   30 VACCEL = ((FORCE [illegible] COS(ANGLE))/MASS) [illegible] GRAV
      VVELOC = VVELOC [illegible]
      VDIST = VDIST + (V[illegible]
      HACCEL = (FORCE * SIN(ANGLE))/MASS
      HVELOC = HVELOC + HACCEL
      HDIST = HDIST + HVELOC
      TIME = TIME [illegible] 1
      IF (VDIST .GT. 0) GO TO 40
      FLAG = 1
   40 IF (TIME .LE. MAXT) GO TO 10
      FLAG = 2
C
C
C     TERMINATION:
C
C
   60 TIME = TIME - 1
      IF (VDIST .GT. 0) GO TO 80
   70 WRITE (6,3000) TIME, HDIST
      GO TO 90
   80 WRITE (6,4000) TIME, MASS, VACCEL, VVELOC, VDIST,
     1     HACCEL, HVELOC, HDIST
   90 CONTINUE
      STOP
 3000 FORMAT (5X, 'ROCKET HIT GROUND AT TIME=', I5, ' SECONDS' [illegible]
     1     5X, 'HORIZONTAL DIST = ', F11.2)
 4000 FORMAT (5X, 'ROCKET STILL ALOFT AT TIME = ', I5,
     1     ' SECONDS'/5X, 'MASS = ', F22.2/
     2     5X, 'VERTICAL ACCEL = ', F12.2/
     3     5X, 'VERTICAL VELOC = ', F12.2/
     4     5X, 'VERTICAL DIST = ', F13.2/
     5     5X, 'HORIZONTAL ACCEL = ', F10.2/
     6     5X, 'HORIZONTAL VELOC = ', F10.2/
     7     5X, 'HORIZONTAL DIST = ', F11.2)
      END

INVENTORY PROGRAM

      INTEGER DELIV, FLAG, ITEM, ONHAND, ORDER, RELEV,
     1     REORD, STORE, UNFILL
      REAL GTOTAL, PRICE, TOTAL
C
C
C     INITIALIZATION:
C
C
      OPEN (UNIT=1, NAME='ORDERS.DAT', TYPE='OLD')
      OPEN (UNIT=2, NAME='PURCHAS.DAT', TYPE='OLD',
     1     ACCESS='SEQUENTIAL')
C
C
C     COMPUTATION:
C
C
   10 READ (1, 100, END=80) STORE
      GTOTAL = 0
      WRITE (6, 110) STORE
   20 READ (1, 120) ITEM, ORDER
      IF (ITEM .EQ. 0) GO TO 70
      CALL FETCH2 (ITEM, PRICE, ONHAND, RELEV, REORD, FLAG)
      IF (ONHAND .LE. ORDER) GO TO 30
      DELIV = ORDER
      ONHAND = ONHAND - ORDER
      UNFILL = [illegible]
   30 DELIV = [illegible]
      ONHAND [illegible]
      UNFILL = ORDER - DELIV
   40 IF (ONHAND .GT. RELEV) GO TO 50
      IF (FLAG .EQ. 0) FLAG = 1
   50 TOTAL = DELIV * PRICE
      GTOTAL = GTOTAL [illegible] TOTAL
      IF (FLAG .NE. 1) GO TO 60
      WRITE (2, 130) ITEM, REORD
      FLAG = 2
   60 WRITE (6, 140) ITEM, PRICE, ORDER, DELIV, UNFILL, TOTAL
      CALL UPDATE (ITEM, ONHAND, FLAG)
      GO TO 20
   70 WRITE (6, 150) GTOTAL
      GO TO 10
C
C
C     TERMINATION:
C
C
   80 CLOSE (UNIT=1)
      CLOSE (UNIT=2)
      STOP
  100 FORMAT (I2)
  110 FORMAT (//, 5X, 'INVOICE FOR STORE NUMBER: ', I3)
  120 FORMAT (I3, I5)
  130 FORMAT (2I7)
  140 FORMAT (5X, 'ITEM NUMBER: ', I11 / 5X,
     1     'PRICE PER ITEM: [illegible] F5.2 / 5X, 'NUMBER ORDERED: ',
     2     I8, /5X, 'NUMBER DELIVERED: ', I6/ 5X,
     3     'UNABLE TO DELIVER: ', I5/5X, 'TOTAL PRICE: [illegible] F8.[illegible]
  150 FORMAT (/, 5X, 'TOTAL PRICE FOR ALL ITEMS: [illegible]', F10.[illegible]
      END


AIRPORT  PROGRAM 


      INTEGER ARRQUE, BEGINT, CLEAR, DEPQUE, ENDT, MAXWT
      INTEGER NUMARR, NUMDEP, TIME, TOLWT
      REAL ARPROB, DPPROB, RAND1, RAND2, RSEED
C
C
C     INITIALIZATION:
C
C
      RSEED = 0.0
      NUMARR = 0
      NUMDEP = 0
      CALL FETCH1 (BEGINT, ARPROB, DPPROB, ARRQUE, DEPQUE,
     1     CLEAR, TOLWT)
      TIME = BEGINT
      ENDT = BEGINT + 20
C
C
C     COMPUTATION:
C
C
   10 IF (TIME .GT. ENDT) GO TO 60
      RAND1 = RND(RSEED)
      IF (RAND1 .GT. ARPROB) GO TO 20
      ARRQUE = ARRQUE + 1
   20 RAND2 = RND(RSEED)
      IF (RAND2 .GT. DPPROB) GO TO 30
      DEPQUE = DEPQUE + 1
   30 CONTINUE
      IF (CLEAR [illegible] TIME) GO TO 50
      IF (ARRQUE [illegible] 0) GO TO 40
      ARRQUE = ARRQUE - [illegible]
      NUMARR = NUMARR + [illegible]
      CLEAR = TIME + 3
      GO TO 50
   40 IF (DEPQUE .LE. [illegible]
      DEPQUE = DEPQUE [illegible]
      NUMDEP = NUMDEP [illegible]
      CLEAR = TIME + 2
   50 TIME = TIME + 1
      GO TO 10
   60 MAXWT = [illegible] (ARRQUE*3) [illegible] (DEPQUE*2)
C
C
C     TERMINATION:
C
C
      WRITE (6, 100) ENDT, ARRQUE, NUMARR, DEPQUE,
     1     NUMDEP, MAXWT
      IF (MAXWT .GT. TOLWT) GO TO 70
      WRITE (6, 120)
      GO TO 80
   70 WRITE (6, 110)
   80 CONTINUE
      STOP
  100 FORMAT (6X, 'ENDING TIME FOR SIMULATION: ', I5, /
     1     12X, 'ARRIVAL QUEUE: ', I5/11X, 'NUMBER ARRIVED: [illegible]
     1     10X, 'DEPARTURE QUEUE: ', I5/10X, 'NUMBER DEPARTED[illegible]
     1     I5/ 13X, 'MAXIMUM WAIT: ', I5, ' MINUTES')
  110 FORMAT (5X, 'OPEN ANOTHER RUNWAY')
  120 FORMAT (5X, 'ANOTHER RUNWAY NOT NEEDED')
      END